Genetic Diversity and Classification of Colletotrichum sublineola Pathotypes Using a Standard Set of Sorghum Differentials

Anthracnose, incited by Colletotrichum sublineola, is the most destructive foliar disease of sorghum; under severe conditions, yield losses can exceed 80% on susceptible cultivars. The hyper-variable nature of the pathogen makes its management challenging despite the availability of several resistant sources. In this study, the genetic variability and pathogenicity of C. sublineola were examined. Sequencing of 140 isolates using restriction site-associated sequencing (RAD-Seq) yielded 1244 quality SNPs. The genetic relationships based on the SNP data showed low to high genetic diversity depending on the isolates’ origin. Isolates from Georgia and North Carolina were grouped into multiple clusters with some level of genetic relationship to each other. Although some isolates from Texas formed a cluster, others clustered with isolates from Puerto Rico. The isolates from Puerto Rico showed a scattered distribution, indicating the diverse nature of these isolates. A population structure and cluster analysis revealed that the genetic variation was stratified into eight populations and one admixture group. The virulence pattern of 30 sequenced isolates on 18 sorghum differential lines revealed 27 new pathotypes. SC748-5, SC112-14, and Brandes were resistant to all the tested isolates, while BTx623 was susceptible to all. Line TAM428 was susceptible to all the pathotypes except pathotype 26. Future use of the 18 differentials employed in this study, which include cultivars/lines that have been used in the Americas, Asia, and Africa, could allow for better characterization of C. sublineola pathotypes at a global level, thus accelerating the development of sorghum lines with stable resistance to the anthracnose pathogen.


Introduction
Sorghum (Sorghum bicolor (L.) Moench) is a versatile crop, and its adaptability in marginal agro-ecological zones makes it indispensable for people and animals living in dry tropical regions [1-5]. Among cereals, sorghum acreage and production rank behind those of maize, rice, wheat, and barley, and its uses include human consumption (notably in the health food industry), animal feed, and biofuel [1,4,6,7]. The adaptability of sorghum to a wide range of environments exposes the crop to diverse abiotic and biotic stresses [1,8]. Abiotic stresses such as drought and high temperatures are critical factors that limit sorghum yield performance in drier tropical zones [8]. Among the biotic stresses, sorghum anthracnose, caused by Colletotrichum sublineola (formerly C. graminicola P. Henn in Kabàt and Bubk), is the most important foliar disease of sorghum, and the pathogen infects all above-ground plant parts, including the panicle, stalk, and grain [9,10]. The foliar phase of the disease is the most damaging, resulting in yield losses of up to 86% [11]. Infection of the stalk results in stalk rot, which may lead to lodging and lower harvestable biomass [12], while panicle infection can result in grain losses of up to 50% [9]. Management options for sorghum anthracnose include crop rotation, application of fungicides, and the use of resistant cultivars [9,10,13,14]. The use of resistant cultivars is the most effective strategy for controlling anthracnose because it lowers production costs and is environmentally friendly [4,9,10,13,15]. However, the hypervariable nature of C. sublineola requires selection for resistance based on the specific pathotypes in a target environment [9,15-18]. In India, Pande et al. [16] tested the pathogenicity of nine anthracnose isolates on thirty sorghum genotypes and reported nine distinct pathotypes. Moore et al. [17] evaluated ninety-eight isolates from Arkansas, USA, on eight sorghum lines and documented thirteen different pathotypes. Although there are morphological variations among the isolates of C. sublineola, these characteristics do not explain the pathogenic differences in the host-pathogen interaction [9,14,16,18]. Recently, Koima et al. [14] evaluated seven C. sublineola isolates with different morphological and cultural characteristics using a detached leaf assay and found no differences in pathogenicity on the sorghum cultivar Kateng'u.
Due to environmental influences on the stability of morphological traits, differentiation among Colletotrichum isolates based on conidial morphology, such as colony color, size, shape, or host origin, is insufficient to assess genetic diversity. Hence, molecular markers have been used to examine diversity in the pathogen [18]. Over the years, the genetic diversity of C. sublineola isolates has been reported by several researchers using DNA markers such as restriction fragment length polymorphism (RFLP), random amplified polymorphic DNA (RAPD), and amplified fragment length polymorphism (AFLP) [14,18-24]. Prom et al. [18] reported high variability among 232 C. sublineola isolates collected from the U.S. and Puerto Rico using AFLP analysis, while Chala et al. [25] noted the existence of diversity among 22 isolates collected from a single sorghum field in Ethiopia. A total of 384 isolates collected from sorghum and Johnsongrass (Sorghum halepense (L.) Pers.) from the U.S., Burkina Faso, Zambia, South Africa, Sudan, Brazil, and Puerto Rico were characterized using RFLP and RAPD fingerprinting by Xavier et al. [24]. Although many such studies have been completed, none has used single nucleotide polymorphism (SNP) markers derived from next-generation sequencing (NGS).
To effectively deploy resistance sources, knowledge of the pathotypes of C. sublineola in a region is essential. The existence of pathotypes and environmental factors can partially explain the differential reactions of sorghum lines that are deployed in different production regions or evaluated in different fields [8,18]. Given that new virulent pathotypes of C. sublineola will occur, their monitoring is of paramount importance for host plant resistance in sorghum. In the current study, the aim was to determine the range of pathotype variation in C. sublineola using a set of sorghum genotypes known to react differently to anthracnose (Table 1) and to sequence 140 C. sublineola isolates to determine genetic diversity through SNP markers.

Fungal DNA Extraction, Restriction Site-Associated Sequencing (RAD-Seq), and Phylogeny Reconstruction
DNA samples from fungal isolates were obtained using the method described by Prom et al. [18]. In brief, mycelium was rinsed 2 to 3 times with 0.1 M MgCl2 and dried (10-15 min) using a Savant SpeedVac DNA 110 (GMI, Ramsey, MN, USA). A MasterPure™ Yeast DNA Purification kit (Biotechnologies, Thermo Fisher Scientific, Austin, TX, USA) was used to extract DNA from 138 isolates of C. sublineola as well as JG1 and JG2 from Johnsongrass in Corpus Christi, Texas. After a quality and quantity check, all the DNA samples were sent to the Genomics and Bioinformatics Service component of Texas A&M AgriLife Research for restriction site-associated sequencing (RAD-Seq). Each sample was bar-coded and sequenced from each end of the restriction fragments using Illumina technology (Illumina, San Diego, CA, USA). The number of reads per sample, each approximately 120 base pairs long, ranged from 780,000 to nearly 7 million (12X coverage). The reads were provided pre-screened to ensure high quality, with the primer adaptor and barcode sequences already stripped. Tools in the CLC Genomics Workbench (v8) were used to align the sequences from each read to the sequenced contigs from a rough draft of the C. sublineola genome entered into GenBank by Baroncelli et al. [34]. A subset of 1244 SNPs with missing data < 10% and minor allele frequency > 0.05 was retained for the population and phylogenetic analyses.
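The two retention criteria stated above (missing data < 10% and minor allele frequency > 0.05) can be illustrated with a minimal Python sketch. This is not the CLC Genomics Workbench workflow used in the study. The haploid 0/1 encoding with None for a missing call, and the function names passes_filters and filter_snps, are assumptions made only for illustration.

```python
from typing import Dict, List, Optional

# Assumed encoding (not taken from the study): one haploid call per isolate,
# 0 = reference allele, 1 = alternate allele, None = missing call.
Call = Optional[int]


def passes_filters(calls: List[Call],
                   max_missing: float = 0.10,
                   min_maf: float = 0.05) -> bool:
    """Return True if a SNP has missing data < max_missing and MAF > min_maf."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return False  # no usable calls at this site
    missing_rate = (len(calls) - len(observed)) / len(calls)
    alt_freq = sum(observed) / len(observed)
    maf = min(alt_freq, 1.0 - alt_freq)  # frequency of the rarer allele
    return missing_rate < max_missing and maf > min_maf


def filter_snps(snps: Dict[str, List[Call]]) -> Dict[str, List[Call]]:
    """Keep only the SNPs that satisfy both thresholds."""
    return {name: calls for name, calls in snps.items() if passes_filters(calls)}
```

Both comparisons are strict, matching the "<" and ">" in the text, so a SNP whose minor allele frequency is exactly 0.05 would be dropped under this reading.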

Population Structure and Cluster Analysis
The population structure of the C. sublineola isolates was determined using the model-based clustering method implemented in STRUCTURE 2.1 [35]. Ten independent runs using an admixture model with correlated frequencies, a burn-in period of 25,000 iterations, and 125,000 Markov chain Monte Carlo (MCMC) iterations were completed for each k value from 1 to 13. The ad hoc statistic ∆k, based on the rate of change in the log probability of data [36], and the observed convergence in the mean of the log probability of the data between successive k values, both as implemented by the Structure Harvester software (https://taylor0.biology.ucla.edu/structureHarvester/) [37], were used to partition the genetic variance into populations. The ten independent runs of the selected k values were matched in CLUMPP [38] to obtain the ancestry membership coefficient of each isolate. The isolates with an ancestry coefficient > 0.75 were assigned to their corresponding population. A principal component analysis (PCA) was conducted in Tassel 5.0 (TASSEL-GBS), which, according to Glaubitz et al. [39], allows for high-throughput genotyping of large numbers of individuals at a considerable number of SNP markers.
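As an illustration of the two decision rules described above, the following Python sketch computes ∆k from per-run log probabilities and applies the 0.75 ancestry threshold. It is a simplified stand-in, not the Structure Harvester or CLUMPP code. It assumes ∆k is computed from the per-k means as |L(k+1) − 2L(k) + L(k−1)| divided by the standard deviation of L(k) across runs, and the names delta_k and assign_population are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List, Union


def delta_k(lnp_by_k: Dict[int, List[float]]) -> Dict[int, float]:
    """Delta k = |L(k+1) - 2*L(k) + L(k-1)| / sd(L(k)), using per-k means of L.

    lnp_by_k maps each k to the log probabilities of its independent runs
    (at least two runs per k are needed for the standard deviation).
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        # Delta k is undefined at the smallest and largest k tested,
        # and when the runs at k show no spread.
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            second_diff = means[k + 1] - 2 * means[k] + means[k - 1]
            result[k] = abs(second_diff) / sds[k]
    return result


def assign_population(q: List[float], threshold: float = 0.75) -> Union[int, str]:
    """Return the 1-based population whose ancestry coefficient exceeds the
    threshold, or 'admixed' if no coefficient does."""
    top = max(range(len(q)), key=lambda i: q[i])
    return top + 1 if q[top] > threshold else "admixed"
```

Under this rule an isolate with ancestry coefficients such as 0.50, 0.45, and 0.05 falls into the admixture group, which is how a set of isolates can remain unassigned even when a k value is well supported.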
The identical-by-state (IBS) genetic distances among the 140 C. sublineola isolates were calculated in Tassel 5.0 and subjected to a clustering analysis using neighbor-joining. The phylogenetic tree was visualized using Interactive Tree of Life [40].
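The idea behind an IBS distance can be sketched in a few lines of Python. The version below assumes haploid 0/1 calls with None for missing data and defines the distance as the proportion of jointly called sites at which two isolates differ. Tassel's own handling of missing or heterozygous calls may differ, and the function and isolate names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple

# Assumed encoding: 0/1 haploid allele calls, None for a missing call.
Call = Optional[int]


def ibs_distance(a: List[Call], b: List[Call]) -> float:
    """Proportion of jointly called sites at which two isolates differ."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")  # no site called in both isolates
    mismatches = sum(1 for x, y in shared if x != y)
    return mismatches / len(shared)


def pairwise_distances(genotypes: Dict[str, List[Call]]) -> Dict[Tuple[str, str], float]:
    """IBS distance for every unordered pair of isolates, keyed by sorted names."""
    return {
        (i, j): ibs_distance(genotypes[i], genotypes[j])
        for i, j in combinations(sorted(genotypes), 2)
    }
```

A matrix of this kind is the input that a neighbor-joining routine would then turn into a tree; the tree-building step itself is omitted here.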

Greenhouse Experiment
The protocols for the greenhouse experiment, inoculum preparation, inoculation, and disease assessment were described by Prom et al. [18,41]. Briefly, the greenhouse experimental design was a split-plot with 30 C. sublineola isolates as the main plot and 18 sorghum differentials as the sub-plot. Seeds from each differential were planted at a rate of eight seeds per tall tree pot (4″ × 14″) (Hummert International) with Metro Mix 200 (BWI) containing potting soil mixed with Osmocote Classic fertilizer 17-7-12 (O.M. Scott & Sons Company, Marysville, OH, USA). Each differential line (RTx2536, SC748-5, BTx398, TAM428, RTx430, Brandes, SC112-14, Theis, BTx378, SC326-6, SC283, BTx623, SC328C, SC414-12E, PI570841, PI570726, PI569979, IS18760) was replicated three times. To accommodate the space in the greenhouse, four tall tree pots were placed in 3-gallon poly-trainer cans (10″ × 9 1/2″ × 8 5/8″) (Hummert International). Germinated plants at the three-leaf stage were thinned to four plants per pot. A total of 200 mL of Peters Excel 15-5-15 (O.M. Scott & Sons Company, Marysville, OH, USA) multi-purpose fertilizer was applied to each tall pot on a bi-weekly basis before inoculation. At the eight-leaf stage, eight C. sublineola-colonized seeds were placed in each plant whorl and, later in the evening, the plants were inoculated with their respective isolate by spraying a 1 × 10⁶ conidia/mL suspension until run-off. To create favorable conditions for disease development, the plants were misted for 30 s at 45 min intervals for 8 h d⁻¹ for one month. The experiments were repeated twice.

Disease Assessment and Data Analysis
The plants were assessed for anthracnose infection twice, 30 days post-inoculation and a week later, using the 1-5 disease rating scale of Prom et al. [18,41], as follows: 1 = no symptoms or chlorotic flecks on leaves; 2 = hypersensitive reaction (reddening or red spots) on inoculated leaves but no acervuli formation; 3 = lesions on inoculated and bottom leaves with acervuli in the center; 4 = necrotic lesions with acervuli observed on inoculated and bottom leaves with infection spreading to middle leaves and not yet on the flag leaves; and 5 = most leaves dead due to infection with infection on the flag leaf containing abundant acervuli. The symptom types were then categorized into the following two reaction classes: resistant = rating 1 or 2; and susceptible = rating 3, 4, or 5. The data on the anthracnose rating were analyzed using the command PROC ANOVA (SAS Institute, SAS version 9.4, Cary, NC, USA).
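The mapping from ratings to reaction classes, and from reaction classes to pathotypes, can be sketched as follows in Python. The sketch assumes one final rating per isolate and differential as input, since the way the two assessments and the replicates are combined is not restated here. It also assumes that isolates sharing the same resistant/susceptible pattern across the differentials belong to the same pathotype. All function names are hypothetical.

```python
from typing import Dict, List, Tuple


def reaction_class(rating: int) -> str:
    """Map a 1-5 rating to 'R' (rating 1 or 2) or 'S' (rating 3, 4, or 5)."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    return "R" if rating <= 2 else "S"


def virulence_pattern(ratings: Dict[str, int],
                      differentials: List[str]) -> Tuple[str, ...]:
    """Reaction classes of one isolate, in a fixed order of differentials."""
    return tuple(reaction_class(ratings[line]) for line in differentials)


def group_pathotypes(isolate_ratings: Dict[str, Dict[str, int]],
                     differentials: List[str]) -> Dict[Tuple[str, ...], List[str]]:
    """Group isolates that share an identical virulence pattern."""
    groups: Dict[Tuple[str, ...], List[str]] = {}
    for isolate, ratings in isolate_ratings.items():
        pattern = virulence_pattern(ratings, differentials)
        groups.setdefault(pattern, []).append(isolate)
    return groups
```

With 18 differentials each pattern is a tuple of 18 classes, and the number of distinct keys returned by group_pathotypes corresponds to the number of pathotypes among the isolates tested.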

Results
The SNP data from 140 C. sublineola isolates were only partially grouped by origin (Figure 1). For example, isolates from Burleson County, Texas, were interspersed among the isolates from Puerto Rico, while the isolates from Wharton County, Texas, were grouped with the isolates from Georgia and North Carolina (Figure 1). The isolates from Puerto Rico showed the most widespread genetic diversity, and the isolates from Johnsongrass, JG-1 and JG-2, grouped together with a 100% bootstrap consensus value and were close to multiple isolates from Georgia.

Genetic Diversity of C. sublineola
The genetic diversity of C. sublineola varied across locations. The population structure analysis based on Δk stratified the genetic diversity into two large populations (Figure 1). We observed that the isolates from Puerto Rico and some from Texas were genetically related and clustered into one population, while the other isolates from Georgia, North Carolina, and Texas constituted another population. To obtain additional insight into the genetic variation of C. sublineola, the genetic variation was also stratified into eight populations (105 isolates) and one admixture group (35 isolates) based on the mean variation of the log probability of the data (Figure 2). This analysis showed that isolates from Georgia, North Carolina, and Texas could be separated into six groups, with the two isolates from Johnsongrass in one group and the isolates from Puerto Rico and Texas in one large group. This population structure was also observed in the principal component analysis, in which isolates from Johnsongrass were located at the center, surrounded by a group of isolates from Georgia and North Carolina. Remarkably, the isolates from Tifton and Cairo, GA, constitute two distinct groups, suggesting that both exhibit unique genetic variation.
The phylogenetic analysis was consistent with the population structure analysis (Figure 2). The isolates from Puerto Rico and Texas were clustered into one main clade, separated from other clades by admixed isolates. The isolates from Tifton, GA, were the most genetically related to the isolates from Johnsongrass. We observed that the isolates from Cairo, GA, are distributed into three clades, of which one includes three isolates from North Carolina. The genetic variation of C. sublineola isolates therefore appears to be associated with the agri-environmental niches of each location.
The sorghum differential lines SC748-5, SC112-14, and Brandes were resistant to all the isolates evaluated, while BTx623 was susceptible to all of them. The host differential line QL3 (India) was resistant to all the isolates except for FSP70 from Puerto Rico and FSP237 from Texas (designated as pathotypes 3 and 13, respectively). The host differential PI570841 was susceptible to all the isolates except for FSP53 from Texas (designated as pathotype 2) and FSP208 from Puerto Rico (designated as pathotype 11).
Isolates from Puerto Rico and Texas were genetically related, while isolates from Georgia and North Carolina constituted another main population. This could be due to the historic exchange of sorghum germplasm between Puerto Rico and Texas, while Georgia and North Carolina are geographically in close proximity. Likewise, the isolates from Georgia showed similar virulence patterns to each other; for example, the Georgia isolates FSP279, FSP280, and FSP281, which formed a group, showed similar, though not identical, virulence patterns. Similarly, Georgia isolates FSP276, FSP277, and FSP278 showed high genetic similarity and similar pathotypes. In contrast, FSP76, FSP92 (Puerto Rico), and FSP265 (North Carolina) were all grouped in pathotype 5, but FSP265 was not genetically close to the other two isolates.
The isolates from Puerto Rico and Texas were highly diverse, while the isolates from Georgia and North Carolina were less so. Puerto Rico is a tropical region where conditions are more favorable for anthracnose development; this, coupled with the fact that the isolates were collected from test plots planted with diverse sorghum germplasm, could partly explain the high variability among the isolates. In contrast, the proximity of Georgia and North Carolina, their similar climatic classification, and the similar sorghum hybrids grown in the two regions may explain the low variability of the pathogen population. The isolates from Georgia were clustered in a tighter group with lower levels of variability. This could be attributed to the fact that the isolates were collected from the same climatic zone with the same cropping system and hybrids, and possibly to a low prevalence and intensity of sorghum anthracnose.
In addition, other species in the genus Colletotrichum cause anthracnose on many economically important plants, including chili (Capsicum spp.), mango (Mangifera indica), orange (Citrus spp.), and strawberry (Fragaria ananassa) [47]. Within several Colletotrichum spp., pathogenic variation based on pathogenicity on sets of host differentials has been documented [48,49]. In a previous study by Prom et al. [18], 17 pathotypes were established from 20 diverse isolates using 18 sorghum differentials, including nine lines previously used by Casela and Ferreira [26]. In the current study, 27 new pathotypes were distinguished using 30 sequenced diverse isolates collected from Georgia, North Carolina, Puerto Rico, and Texas and evaluated with the same 18 sorghum differentials used earlier by Prom et al. [18]. Similar host-Colletotrichum spp. studies in Brazil resulted in five pathotypes of C. graminicola when the virulence pattern of 190 isolates on 15 maize differentials was observed [49]. Montri et al. [48] documented three pathotypes of C. capsici out of eleven isolates using nine chili differentials. In this study, Brandes, SC748-5, and SC112-14 were resistant to all the C. sublineola isolates tested. However, Tsedaley et al. [50] observed that SC748-5 was susceptible to one of the five isolates from Ethiopia tested in the greenhouse.
Although only 30 isolates collected from four climatic zones were tested, many pathotypes were identified, confirming the hyper-variable nature of C. sublineola. Yet no association was noted, nor was any inference made, between specific climatic zones and pathotype. In the present study, the isolates within each population group revealed high levels of variability for the genes affecting pathogenicity on the sorghum differentials. Similarly, high levels of variability have also been noted among 232 C. sublineola isolates placed in four clusters based on AFLP analysis [18]. Using RAPD and RFLP-PCR markers to evaluate the genetic diversity among 37 sorghum anthracnose isolates collected from Brazil, Valèrio et al. [22] observed no association between virulence patterns and molecular profiles. Also, a RAPD analysis of 19 C. lupini isolates detected high intraspecific genetic diversity with marked differences in pathogenicity on the susceptible cultivar 'kiev mutant' [51]. However, the clustering of Xanthomonas translucens pv. undulosa and X. translucens pv. translucens based on multilocus sequence typing and multilocus sequence analysis showed correlations among the strains and levels of virulence on inoculated wheat and barley [52]. That study suggests that molecular tools to determine genetic diversity could be used to predict relative virulence in some host pathosystems. In the sorghum anthracnose pathosystem, Chala et al. [23] suggested that certain factors such as geographic separation, diverse sorghum lines planted, and the different agro-ecological zones where the crops are planted may contribute to the evolution and diversity of C. sublineola. However, other mechanisms that contribute to fungal population diversity include mutation, sexual reproduction, gene gain or loss, gene family expansion and contraction, transposable elements, loss of heterozygosity, copy variation, etc. [53,54]. Some, if not all, of these mechanisms may also be operating in the C. sublineola pathogen population.
Further, due to the existence of a large number of C. sublineola pathotypes, continuous evaluation of sorghum germplasm and robust monitoring of any changes in the pathogen population, coupled with the use of a standard set of differentials to compare pathotypes, would help researchers identify stable sources of anthracnose resistance. Additionally, crosses among sorghum differentials and the study of their inheritance may improve our understanding of whether the gene-for-gene concept operates in this host-pathogen interaction.

Figure 1. Population structure analysis of one hundred and thirty-eight C. sublineola isolates collected in sorghum fields in Georgia (GA), Texas (TX), North Carolina (NC), and Puerto Rico (PR) and two isolates collected from Johnsongrass in Corpus Christi, Texas, using 1244 SNPs. (A) Estimation of the number of populations in the 140 C. sublineola isolates based on the analysis in STRUCTURE, with ∆k values (axis 1; black dashed line) and the estimated Ln probability of data (axis 2; a different colored line for each STRUCTURE run) using 10 runs for each K value from 1 to 13. (B) Hierarchical organization of genetic relatedness of 140 C. sublineola isolates for K values of 2 and 9. (C) Principal component analysis of the 105 C. sublineola isolates present in the eight populations found in the STRUCTURE analysis.

Figure 2. Unrooted neighbor-joining tree for 140 C. sublineola isolates collected in Georgia, Texas, North Carolina, and Puerto Rico. Colored branches represent isolates belonging to the eight populations found in the STRUCTURE analysis, while admixture isolates are not colored. The red stars represent isolates used for a virulence analysis against 18 sorghum differential lines.

Table 1. Sorghum differentials used in prior studies, number of pathotypes identified, and references.

Table 2. Details of Colletotrichum sublineola isolates evaluated in this study.

Table 3. Analysis of variance for the severity ratings of the thirty Colletotrichum sublineola isolates inoculated individually on the eighteen host differentials.